Abstract

Using a temporal version of the Copernican principle, Gott has proposed a statistical predictor
of future longevity based on present age [J. R. Gott III, Nature 363, 315 (1993)] and applied
the predictor to a variety of examples, including the longevity of the human species. Though
Gott's proposal contains a grain of truth, it does not have the universal predictive power that
he attributes to it.

Returning from a five-week residence at the Isaac Newton Institute this past summer, I found
on my desk the July 21 issue of The New Yorker, containing a provocative story by the well known
science writer Timothy Ferris [1]. The story, entitled "How to Predict Everything," describes
how J. Richard Gott, a Princeton astrophysicist, makes universal probabilistic predictions for a
phenomenon's future duration based on knowing how long the phenomenon has lasted. The
justification for Gott's rule is said to be a temporal version of the Copernican principle: when you
observe a phenomenon in progress, your observation does not occur at a special time.

Here is Gott's account, as related to Ferris, of how he conceived his rule while contemplating 
the Berlin Wall. 



Standing at the Wall in 1969, I made the following argument, using the Copernican principle.
I said. Well, there's nothing special about the timing of my visit. I'm just travelling — you know, 
Europe on five dollars a day — and I'm observing the Wall because it happens to be here. My 
visit is random in time. So if I divide the Wall's total history, from the beginning to the end, 
into four quarters, and I'm located randomly somewhere in there, there's a fifty-per-cent chance 
that I'm in the middle two quarters — that means, not in the first quarter and not in the fourth 
quarter. 

Let's suppose that I'm at the beginning of that middle fifty per cent. In that case, one 
quarter of the Wall's ultimate history has passed and there are three quarters left in the future.
In that case, the future's three times as long as the past. On the other hand, if I'm at the other
end, then three quarters have happened already, and there's one quarter left in the future. In
that case, the future is one-third as long as the past.

*This work was supported in part by the U.S. Office of Naval Research (Grant No. N00014-93-1-0116).

(The Wall was) eight years (old in 1969). So I said to a friend, "There's a fifty-per-cent 
chance that the Wall's future duration will be between (two and) two-thirds of a year and 
twenty-four years." Twenty years later, in 1989, the Wall came down, within those two limits 
that I had predicted. I thought, Well, you know, maybe I should write this up.

Ferris goes on to recount how Gott applies his method to the longevity of the human species. 

The question that Gott has been asking lately is how long the human species is going to 
last. Since scientists generally make predictions at the ninety-five-per-cent confidence level,
Gott begins with the assumption that you and I, having no reason to think we've been born in a 
special time, are probably living during the middle ninety-five per cent of the ultimate duration 
of our species. In other words, we're probably living neither during the first two and a half per 
cent nor during the last two and a half per cent of all the time that human beings will have 
existed.

"Homo sapiens has been around for two hundred thousand years," Gott said .... "That's
how long our past is. Two and half per cent is equal to one-fortieth, so the future is probably 
at least one-thirty-ninth as long as the past but not more than thirty-nine times the past. If 
we divide two hundred thousand years by thirty-nine, we get about fifty-one hundred years. If 
we multiply it by thirty-nine, we get 7.8 million years. So if our location in human history is 
not special, there's a ninety-five-per-cent chance we're in the middle ninety-five per cent of it. 
Therefore the human future is probably going to last longer than fifty-one hundred years but 
less than 7.8 million years. 

"Now, those numbers are interesting, because they give us a total longevity that's comparable 
to that of other species." 

These glib predictions astonished me, not because Gott concludes from them that homo sapiens 
is unlikely to last longer than other species — that is a legitimate subject for inquiry and debate — but 
because they are put forward as a universal rule, applicable no matter what other information one 
has about the phenomenon in question. In making statistical predictions of future longevity, Gott 
dismisses the entire process of assembling and organizing information about a phenomenon, evalu- 
ating that information critically, and if possible, formulating laws that describe the phenomenon. 
Put succinctly, he rejects as irrelevant the process of rational, scientific inquiry, replacing it with a 
single, universal statistical rule. That has to be wrong. 

I decided it was important to find the flaws in Gott's reasoning: flawed thinking is an inevitable,
even necessary part of the scientific enterprise, but when it makes its way into The New Yorker,
the time has come to find the flaws and draw attention to them. I began by requesting from the
UNM Library a copy of the Nature article where Gott proposes his rule and applies it to the
above examples and others. A citation search turned up two other pieces in which Gott adds to the
content of his Nature article: a Letter to Nature responding to letters criticizing the original
article and a chapter in the proceedings of an Astronomical Society of the Pacific (ASP) Symposium.
The present paper analyzes what I found in Gott's papers and reports my conclusions.

Gott's delta-t argument 

Gott justifies his probabilistic predictions by making what he calls the delta-t argument.
Suppose there is a phenomenon that has a beginning, or birth, at time $t_0$ and an end, or death,
at time $t_0 + T$, $T$ being the duration of the phenomenon. You observe the event at a time $t$
between the beginning and the end, corresponding to a present age, $t_p = t - t_0$, and a future
duration, $t_f = T - t_p$. If there is nothing special about the observation time (this is the content
of the temporal Copernican principle), Gott reasons that $t_p = t - t_0$ is a random variable uniformly
distributed between 0 and $T$. This means that $t_p$ lies between $aT$ and $bT$, $0 \le a < b \le 1$, with
probability $b - a = f$; in symbols, we write

$$P(aT < t_p < bT) = b - a = f . \tag{1}$$

Gott's next step is to infer from Eq. (1) that the duration $T$ lies between the corresponding
bounds, $t_p/b$ and $t_p/a$, with the same probability $f$. Translated to future duration, this says that
$t_f$ lies between $(b^{-1} - 1)\,t_p$ and $(a^{-1} - 1)\,t_p$ with probability $f$, i.e.,

$$G\bigl((b^{-1} - 1)\,t_p < t_f < (a^{-1} - 1)\,t_p\bigr) = b - a = f . \tag{2}$$

All of Gott's predictions flow from this probability rule. I use the letter $G$ to distinguish probabilities
based on this rule.

Gott phrases his predictions in terms of particular $f \times 100\%$ confidence levels, which he obtains
by letting $a$ and $b$ be equidistant from 0 and 1, i.e., $a = 1 - b$. The resulting choices,
$a = \frac{1}{2}(1 - f)$ and $b = \frac{1}{2}(1 + f)$, lead to Gott's confidence-level prediction:

$$G\!\left(\frac{1-f}{1+f}\,t_p < t_f < \frac{1+f}{1-f}\,t_p\right) = f \qquad (f \times 100\%\ \text{confidence level}). \tag{3}$$

For example, in his encounter with the then 8-year-old Berlin Wall ($t_p = 8$ yr), Gott used $f = 1/2$, with
$a = 1/4$ and $b = 3/4$, which led him to predict with 50% confidence that the total duration of the
Wall would lie between $4t_p/3 = 10\tfrac{2}{3}$ yr and $4t_p = 32$ yr or, equivalently, that the future duration
would lie between $t_p/3 = 2\tfrac{2}{3}$ yr and $3t_p = 24$ yr. In most of his work, Gott uses a 95% confidence
level, corresponding to $f = 0.95$.
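The confidence-level prediction of Eq. (3) is simple enough to evaluate directly. The following Python sketch does so; the function name `gott_interval` and its argument names are illustrative choices made here, not anything taken from Gott's papers.

```python
def gott_interval(present_age, confidence):
    """Return (lower, upper) bounds on future duration t_f from Eq. (3).

    present_age -- t_p, in any time unit
    confidence  -- f, with 0 < f < 1
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    lower = present_age * (1.0 - confidence) / (1.0 + confidence)
    upper = present_age * (1.0 + confidence) / (1.0 - confidence)
    return lower, upper


if __name__ == "__main__":
    # Berlin Wall in 1969: t_p = 8 yr, f = 1/2
    print(gott_interval(8.0, 0.5))
    # Homo sapiens: t_p = 200,000 yr, f = 0.95
    print(gott_interval(200_000.0, 0.95))
```

For the Berlin Wall the bounds are 8/3 yr and 24 yr, and for the human species they are 200,000/39 yr and 39 x 200,000 yr, which are the figures quoted in the text.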

Another form of Gott's rule arises from letting $b = 1$ and $a = (1 + Y)^{-1}$. Inserting these choices
into Eq. (2), one finds that $t_f < Y t_p$ with probability $Y/(1 + Y)$; equivalently, the probability that
the future duration is not less than $Y t_p$ is $(1 + Y)^{-1}$, i.e.,

$$G(t_f \ge Y t_p) = \frac{1}{1 + Y} . \tag{4}$$

In his Nature article, Gott derives Eq. (4) independently of the delta-t argument by assuming that
the phenomenon of interest is an exponential decay [see Gott's Eq. (6) and preceding discussion].
There being no hint in the delta-t argument that Gott restricts his method to exponential decays,
this derivation must be intended as an example of his method. I defer discussion of this derivation,
since its status can be appreciated only after exposing and correcting the flaws in Gott's reasoning.

The delta-t argument implies that Gott's rule provides a universal method for predicting the 
future duration of any phenomenon, the only assumption being that the observation time is not 
special. Moreover, it is clear from the variety of phenomena to which Gott applies his rule — 
durations of the Berlin Wall, Stonehenge, and the Soviet Union, the publication lifetime of Nature, 
longevity of the human species, and in his ASP contribution and in his conversations with Ferris, 
running times of plays in New York — that he places no restrictions on the applicability of his rule. 

It is not hard to find an error in the delta-t argument: the step from Eq. (1) to Gott's rule (2)
has no justification in probability theory. This error has been pointed out by Buch, in a Letter
to Nature criticizing Gott's method. The total duration $T$ (or the future duration $t_f$) is
unknown and thus must be treated as a random variable described by a prior probability
distribution. This prior distribution expresses whatever information one possesses that can be used to
make probabilistic statements about the phenomenon's duration. After collecting the data that
the phenomenon's present age is $t_p$, the only procedure authorized by probability theory is to
update the prior distribution to a new, posterior distribution for $T$ (or $t_f$), which reflects both the
prior information and the present age. The formal procedure for this updating is called Bayes's
theorem.

The error just identified is sufficient to invalidate the delta-t argument. To correct it requires
an analysis that uses Bayes's theorem to update probabilities. Indeed, Gott has endorsed a
Bayesian analysis suggested by Buch; this Bayesian analysis, said to be based on the temporal
Copernican principle, leads to Gott's rule, provided one uses a particular prior distribution, $dT/T$,
called the Jeffreys prior. The reader should be aware, however, that the Bayesian analysis
suggested by Buch and endorsed by Gott is also flawed. In considering the Buch-Gott Bayesian
analysis below, we will uncover this flaw, thus revealing a second error in the delta-t argument, just
as serious as the first, but more insidious because it is more subtle: Equation (1) is an incorrect
mathematical formulation of the temporal Copernican principle. The pay-off for identifying this
second flaw is that it clarifies the meaning and status of the temporal Copernican principle. In
developing a proper Bayesian analysis based on the temporal Copernican principle, we will discover
that Gott's rule is a universal consequence of the Copernican principle, in the situation where one
knows the phenomenon to be in progress, but does not know its present age. Not knowing the
present age, one cannot make Gott's predictions of future duration.

Before turning to the Bayesian analysis, however, I introduce a few examples that show that 
Gott's rule cannot be a universal predictor and also serve to put some flesh on the dry bones of the 
subsequent Bayesian analysis. 

Examples of using Gott's rule 

I advise my students to test the solution to a homework problem by considering special cases 
where the solution is already known. This common-sense technique, a good rule in scientific thinking 
and in everyday life, provides compelling evidence that Gott's predictions cannot have the universal 
validity that he attributes to them. 

• Exponential decay. Consider an atom that is excited to a metastable energy level at some
unknown time and then decays exponentially to the ground state with a decay constant
$\tau^{-1} = (20\ \text{min})^{-1}$. You come along at time $t$ and are told that the atom is in the metastable level,
having been excited a time $t_p = 15$ min ago. According to Gott, you can predict with 95%
confidence that the decay will occur between $t_f = t_p/39 = 23.1$ s and $t_f = 39t_p = 9.75$ hr into
the future; more telling is that Eq. (4) predicts that $t_f \ge 4t_p = 60.0$ min with probability 1/5.
These predictions contradict the defining property of an exponential decay: being informed
that the atom is still in the excited state at time $t$ simply resets the clock so that your
expectations for its future decay are the same as though it had been initially excited at time $t$.
Specifically, you predict that it will survive a further time $t_f$ without decaying with probability
$e^{-t_f/\tau}$, corresponding to 95% confidence of decay between $t_f = 0$ and $t_f = \tau \ln 20 = 3.00\tau =
59.9$ min. Though the numerical discrepancies between Gott's predictions and the predictions
of an exponential decay are important, they are only a symptom of the real problem: Gott's
rule, by including present age in the prediction of future duration, is inconsistent with the
very notion of an exponential decay.

Buch has pointed out that Gott's rule is inconsistent with the properties of an exponential
decay. In his reply to Buch and in his ASP contribution, Gott admits that his method doesn't
apply to an exponential decay whose decay constant is known. Instead, he says that it applies to an
exponential decay whose decay constant is unknown and distributed according to the Jeffreys prior
$d\tau/\tau$; this leads to the Jeffreys prior $dT/T$ for total duration $T$ and is not an exponential decay
at all. Gott also reasserts his exponential-decay derivation of Eq. (4), to be discussed below.
All this leaves one thoroughly confused (does Gott regard his rule as universal or not?), but his
subsequent conversations with Ferris make clear that he does not acknowledge any restrictions
on the use of his rule.

• Longevity of an individual. Suppose you are going to a meeting of your book club, to be held 
at a member's house that you've never been to before. You find the right street, but having 
forgotten the street address, you choose between two houses where there is evident activity. 
Knocking at one, you are told that the activity within is a birthday party, not a book-club 
meeting. Your friendly enquiry about the age of the celebrant elicits the reply that she is 
celebrating her 50th birthday ($t_p = 50$ yr). According to Gott, you can predict with 95% confidence
that the woman will survive between $t_p/39 = 1.28$ years and $39t_p = 1{,}950$ years into the
future. Since the wide range encompasses reasonable expectations regarding the woman's
survival, it might not seem so bad, till one realizes that Eq. (4) predicts that with probability
1/2 the woman will survive beyond 100 years old and with probability 1/3 beyond 150. Few
of us would want to bet on the woman's survival using Gott's rule. 

One might object at this point that Gott probably didn't intend his rule to apply to an individ- 
ual's longevity, but in his ASP contribution Gott applies the rule to himself: "At the time my 
(Nature) paper was published on May 27, 1993, I was 46.3 years old, so the 95% delta-t argument 
predicted that my future longevity would be at least 1.2 years but less than 1,806 years. I have 
survived past the lower limit already and so if I don't make it past the upper limit, then that 
prediction will indeed prove correct for me!" 

• Deterministic phenomena. The best testing ground for ideas comes from extreme cases, and 
here the most extreme case is a deterministic phenomenon. Putting the example in a dramatic 
context, suppose you are captured by terrorists, who confine you to a small room. You are told 
that at some time in the next 24 hours, a timer will be set and that after it has ticked for 30 
minutes, poison gas will fill the room, killing you. You are then drugged and wake up to find 
the timer ticking and reading 20 minutes since being set. According to Gott, you can predict
with 95% confidence that the time to release of the gas lies between $t_f = 20\ \text{min}/39 = 30.8$ s
and $t_f = 39 \times 20\ \text{min} = 13.0$ hr; even worse, Eq. (4) predicts that with probability 2/3 the
time to release is 10 min or more. These reassuring predictions provide scant comfort, since
you know you have 10 min to live.
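The special cases above can be put side by side numerically. The following Python sketch compares the tail probability of Gott's rule, Eq. (4), with what is actually known in the atom and timer examples. The function names are hypothetical, and the 15-minute test point for the timer is an illustrative choice made here.

```python
import math


def gott_tail(Y):
    """Gott's rule, Eq. (4): probability that t_f >= Y * t_p."""
    return 1.0 / (1.0 + Y)


def exponential_tail(t_f, tau):
    """Memoryless decay: probability of surviving a further time t_f,
    whatever the present age."""
    return math.exp(-t_f / tau)


def timer_tail(t_f, t_p, total=30.0):
    """Deterministic timer that runs for `total` minutes: the remaining
    time is at least t_f only if t_p + t_f <= total."""
    return 1.0 if t_p + t_f <= total else 0.0


if __name__ == "__main__":
    # Atom: t_p = 15 min, tau = 20 min; chance of lasting 60 more minutes.
    print(gott_tail(60.0 / 15.0), exponential_tail(60.0, 20.0))
    # Timer: t_p = 20 min; chance of lasting 15 more minutes.
    print(gott_tail(15.0 / 20.0), timer_tail(15.0, 20.0))
```

In the atom example Gott's rule assigns probability 1/5 to a further hour of survival, whereas the exponential law gives $e^{-3}$, about 0.05. In the timer example Gott's rule assigns a sizable probability to outcomes that are known to be impossible.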

These examples demonstrate that Gott's rule cannot be a universal method for predicting future 
durations. If the rule has any validity, it must involve other information than the present age of a 
phenomenon. As E. T. Jaynes taught us, when probabilistic predictions violate one's intuition,
the proper response is neither to accept the nonintuitive predictions without question nor to dismiss 
them out of hand, but rather to identify the information underlying the prediction. You will either 
find the information inapplicable to the situation at hand, thereby confirming your intuition and 
allowing you to discard the predictions, or you will sharpen your intuition. 

The objective of this paper is to identify the prior information that underlies Gott's rule. The 
tool is Bayesian analysis. We will discover that the temporal Copernican principle contains a grain 
of truth, but that grain of truth does not include Gott's predictions of future duration. 
A flawed, but instructive Bayesian analysis

Return to the general situation introduced above, that of a phenomenon with a birth time $t_0$
and a duration $T$. You observe the phenomenon at time $t$. It is often useful to replace one or
both quantities, $t_0$ and $T$, by the present age, $t_p = t - t_0$, and the future duration, $t_f = T - t_p$.
In developing the Bayesian analysis, I first formulate and analyze a flawed approach, advanced by
Buch and endorsed by Gott, which is modeled on Gott's delta-t argument. For this purpose,
it is most convenient to use $t_p$ and $T$ as the primary variables. The reason for going through this
flawed analysis is that it turns up the second error in Gott's delta-t argument.

Your prior information about the phenomenon is expressed in a prior probability $p(t_p,T)\,dt_p\,dT$,
the joint probability that the phenomenon has lasted a time between $t_p$ and $t_p + dt_p$ at the time
of observation and that the phenomenon will last a total time between $T$ and $T + dT$. The joint
probability density can be written as $p(t_p,T) = p(t_p|T)\,w(T)$, where $p(t_p|T)$ is the conditional
probability density for the present age, given a total duration $T$, and $w(T)$ is your prior probability
density for the total duration. Throughout I use upper-case letters for probabilities and lower-case
letters for probability densities.

Before going further, it is useful to introduce two quantities related to $w(T)$: $\lambda(T)$ is the death
rate, i.e., $\lambda(T)\,dT$ is the probability that the phenomenon, having lasted a time $T$, ends in the
next $dT$; and $Q(T)$ is the survival probability, i.e., the probability that the phenomenon lasts at least
a time $T$. These quantities are related by

$$w(T) = -\frac{dQ}{dT} = Q(T)\,\lambda(T) \tag{5}$$

or, equivalently, by

$$Q(T) = \int_T^{\infty} dT'\, w(T') = \exp\!\left(-\int_0^T dT'\, \lambda(T')\right) . \tag{6}$$

An exponential decay is characterized by a constant death rate, $\lambda(T) = \lambda_0$, in which case
$Q(T) = e^{-\lambda_0 T}$ and $w(T) = \lambda_0 e^{-\lambda_0 T}$.
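The relations (5) and (6) between death rate, survival probability, and prior density can be checked numerically. The Python sketch below integrates a death rate with the trapezoidal rule; the function names and the choice of integration rule are assumptions made for illustration.

```python
import math


def survival_from_hazard(hazard, T, steps=10_000):
    """Q(T) = exp(-integral_0^T hazard(T') dT'), the second form of Eq. (6),
    with the integral evaluated by the trapezoidal rule."""
    h = T / steps
    total = 0.5 * (hazard(0.0) + hazard(T))
    for i in range(1, steps):
        total += hazard(i * h)
    return math.exp(-total * h)


def density_from_hazard(hazard, T, steps=10_000):
    """w(T) = Q(T) * hazard(T), the second equality in Eq. (5)."""
    return survival_from_hazard(hazard, T, steps) * hazard(T)


if __name__ == "__main__":
    lam0 = 1.0 / 20.0  # constant death rate, (20 min)^-1
    # For a constant rate the result should match exp(-lam0 * T).
    print(survival_from_hazard(lambda T: lam0, 20.0), math.exp(-1.0))
    print(density_from_hazard(lambda T: lam0, 20.0), lam0 * math.exp(-1.0))
```

A death rate that grows with age, such as `lambda T: T`, can be passed in the same way to see how the survival probability departs from the exponential form.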

Gott's formulation of the temporal Copernican principle is the following: if there is nothing
special about the observation time, the present age is a random variable uniformly distributed
between 0 and $T$, i.e.,

$$p(t_p|T)\,dt_p = dt_p/T , \qquad 0 \le t_p \le T . \tag{7}$$

This is the probability-density version of Eq. (1). We now use Bayes's theorem,

$$P(X|Y)P(Y) = P(X,Y) = P(Y|X)P(X) , \tag{8}$$

to find your posterior probability density for the total duration, given the present age:

$$p(T|t_p) = \frac{p(t_p|T)\,w(T)}{p(t_p)} = \begin{cases} 0 , & T < t_p , \\ w(T)/[T\,p(t_p)] , & T \ge t_p . \end{cases} \tag{9}$$

The unconditional probability density $p(t_p)$ for the present age, which is a normalization constant
in this expression, is given by

$$p(t_p) = \int_{t_p}^{\infty} dT\, \frac{w(T)}{T} . \tag{10}$$

One can easily verify that Gott's rule, embodied in Eqs. (2)-(4), is equivalent to a posterior
density

$$g(T|t_p) = \begin{cases} 0 , & T < t_p , \\ t_p/T^2 , & T \ge t_p . \end{cases} \tag{11}$$

To get this posterior from the present analysis, one must assume the (unnormalizable) Jeffreys
prior,

$$w_J(T) = \frac{1}{T} . \tag{12}$$
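That the Jeffreys prior reproduces Gott's rule within this (flawed) analysis can be confirmed numerically. The Python sketch below computes the tail probability $P(t_f \ge Y t_p \mid t_p)$ from the posterior (9), which is proportional to $w(T)/T$ for $T \ge t_p$. The function names, the substitution $u = 1/T$, and the exponential comparison prior are illustrative assumptions.

```python
import math


def flawed_posterior_tail(prior, t_p, Y, steps=20_000):
    """P(t_f >= Y t_p | t_p) under the posterior of Eq. (9).

    The posterior is proportional to prior(T)/T for T >= t_p, and
    t_f >= Y t_p is the same as T >= (1 + Y) t_p.  With u = 1/T,
        integral_{T0}^inf prior(T)/T dT = integral_0^{1/T0} prior(1/u)/u du,
    which is evaluated by the midpoint rule (never touching u = 0).
    """
    def tail_integral(T0):
        h = (1.0 / T0) / steps
        total = 0.0
        for i in range(steps):
            u = (i + 0.5) * h
            total += prior(1.0 / u) / u
        return total * h

    return tail_integral((1.0 + Y) * t_p) / tail_integral(t_p)


def jeffreys(T):
    """Unnormalized Jeffreys prior, Eq. (12)."""
    return 1.0 / T


def exponential_prior(T, tau=20.0):
    """Exponential prior with mean tau, for comparison."""
    return math.exp(-T / tau) / tau


if __name__ == "__main__":
    print(flawed_posterior_tail(jeffreys, 15.0, 4.0))           # Gott: 1/(1+Y)
    print(flawed_posterior_tail(exponential_prior, 15.0, 4.0))  # not Gott's value
```

With the Jeffreys prior the result is $1/(1+Y)$ for any present age, as in Eq. (4); with any other prior the result depends on the prior and on $t_p$.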

Buch concludes that Gott's rule is unreasonable because it corresponds to an unnormalizable
prior density. Gott replies (correctly, I think) that there is nothing wrong with an unnormalizable
prior, since the posterior density for $T$ can be normalized. He defends the Jeffreys prior as being
the appropriate "vague Bayesian prior" to use in a situation where one initially knows nothing
about the magnitude of the duration.

Jaynes has delineated the conditions for using the Jeffreys prior, showing that it should
be used when one's prior information is unchanged by a rescaling of the total duration,
$T' = \alpha T$ ($\alpha > 0$). If one's prior information is unchanged by the rescaling, then the density for $T'$,
$w'(T') = w(T)\,dT/dT' = w(T)/\alpha$, should have the same functional form as the original density, i.e.,
$w'(T') = w(T')$. This gives $w(\alpha T) = w(T)/\alpha$, which implies that $w(T) \propto 1/T$. This might seem
to be progress in identifying the information that underlies Gott's rule (use it when one has no
prior information about time scales associated with the phenomenon), but it turns out not to be,
because the present Bayesian analysis is wrong. The reason for presenting it is not to consider its
consequences, but to identify where it goes wrong.

A straightforward Bayesian analysis 

That something is wrong is made apparent by a different analysis of the same situation, this
time a straightforward Bayesian analysis that does not invoke the Copernican principle. Your
prior information about the total duration is expressed in the prior density $w(T)$. You observe the
phenomenon still to be in progress a time $t_p$ after its beginning. The conditional probability for
this observation, given a total duration $T$, is 0 if $t_p > T$ and 1 if $t_p < T$. Thus Bayes's theorem
implies, with $O$ denoting the observation,

$$p(T|O) = \frac{P(O|T)\,w(T)}{P(O)} = \begin{cases} 0 , & T < t_p , \\ w(T)/Q(t_p) , & T \ge t_p . \end{cases} \tag{13}$$

Here the normalization constant is the survival probability, i.e., $P(O) = Q(t_p)$.



The posterior density (13) is so eminently reasonable that one could have written it down
without using the formal apparatus of Bayes's theorem. It says that the effect of discovering the
present age is to rule out durations shorter than the present age; your posterior expectations for
durations longer than the present age are the same as your prior expectations, with appropriate
renormalization. Notice that this inference updates sensibly: subsequent observations that find
the phenomenon still in progress simply exclude a wider interval of durations. Yet putting this
simple inference in the context of the Copernican principle apparently yields a different posterior
density (9) for the total duration. How can that be? There's nothing wrong with the Bayesian
inference in either analysis, so the culprit must be Gott's formulation of the temporal Copernican
principle. Thus we arrive at the second error in the delta-t argument: the uniform density (7)
for $t_p$, and, by extension, Eq. (1), is not the correct mathematical formulation of the temporal
Copernican principle.
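The straightforward posterior (13) is easy to put to work. Integrating it from $t_p + t_f$ upward gives the probability of surviving a further time $t_f$ as $Q(t_p + t_f)/Q(t_p)$. The Python sketch below evaluates this ratio; the helper names and the uniform comparison prior are illustrative assumptions.

```python
import math


def posterior_survival(Q, t_p, t_f):
    """P(T >= t_p + t_f | in progress at age t_p) = Q(t_p + t_f) / Q(t_p),
    obtained by integrating the posterior density of Eq. (13)."""
    return Q(t_p + t_f) / Q(t_p)


def exponential_Q(tau):
    """Survival probability for an exponential decay with mean lifetime tau."""
    return lambda T: math.exp(-T / tau)


def uniform_Q(T_max):
    """Survival probability for a duration uniform on [0, T_max]."""
    return lambda T: max(0.0, 1.0 - T / T_max)


if __name__ == "__main__":
    Q = exponential_Q(20.0)
    # Memoryless: the present age drops out for an exponential decay.
    print(posterior_survival(Q, 15.0, 60.0), posterior_survival(Q, 0.0, 60.0))
    # For a uniform prior the present age does matter.
    print(posterior_survival(uniform_Q(100.0), 50.0, 25.0))
```

For the exponential prior the answer is independent of the present age, as the atom example requires; for other priors the present age enters only by excluding durations shorter than $t_p$.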

Where the uniform density goes wrong is in assuming that your observation occurs while the
phenomenon is in progress. If your observation does not occur at a special time, then it is very
likely that it occurs before the phenomenon begins or after it has ended. Including these other
possibilities leads to a proper Bayesian formulation of the temporal Copernican principle, which is
consistent with the inference expressed in Eq. (13).
A proper Bayesian analysis of the temporal Copernican principle

In formulating a proper Bayesian analysis, it is convenient to choose the birth time $t_0$ and
the total duration $T$ as the primary variables. Your prior knowledge about these two quantities
is incorporated in two probability densities: (i) $\gamma(t_0)$ gives the probability $\gamma(t_0)\,dt_0$ that the
phenomenon begins between times $t_0$ and $t_0 + dt_0$; (ii) $p(T|t_0)$ gives the probability $p(T|t_0)\,dT$ that the
phenomenon lasts a time between $T$ and $T + dT$, given that it began at time $t_0$. The corresponding
joint probability density is $p(t_0,T) = p(T|t_0)\,\gamma(t_0)$.

The temporal Copernican principle, that your observation does not take place at a special
time, is a time-translation symmetry that restricts the form of the prior densities. To say
that your observation time is not special is to say that your prior information is unchanged if the
entire phenomenon is displaced in time while your observation time remains fixed. To be consistent
with this translation symmetry, your prior probability density should be unchanged by such a time
translation; i.e., $p(t_0,T)$ should be independent of the birth time $t_0$. Thus the temporal Copernican
principle can be captured precisely in the following two statements:

1. The phenomenon is equally likely to begin at any time. This means that $\gamma(t_0)$ is a constant.
In order to work with normalizable probabilities, I replace the exact symmetry with the
approximate one that $\gamma(t_0)$ has a constant value, $1/A$, at all times within a very long time
interval. The duration $A$ of this very long time interval exceeds all other times relevant to
the problem, particularly typical durations.

2. Probabilities for total duration are independent of birth time. This means that the conditional
probability density $p(T|t_0)$ does not depend on $t_0$ and can be written as $p(T|t_0) = w(T)$, where
$w(T)$ is the probability density introduced above.

Should you be dissatisfied with these restrictions on the prior probabilities, it means that you 
do not accept the temporal Copernican principle as applying to your prior information. Dissatis- 
faction should not be surprising, for one would not expect the Copernican principle to apply to all 
situations. The three examples introduced above illustrate considerations that arise in using the 
temporal Copernican principle. In all three examples, it is easy to accept that ignorance of the 
birth time is described by the time-translation symmetry of the temporal Copernican principle: 
the atom can be excited at any time during an interval much longer than the decay time; for the 
woman at the birthday party, the situation could be phrased in terms of an individual whose birth 
could occur at any time over a period much longer than a typical human lifetime; the timer can 
be set at any time within a 24-hour period, a period somewhat longer than the 30 minutes that 
the timer ticks. Moreover, in the cases of the atom and the poison gas, duration probabilities are 
independent of the birth time. In contrast, in the case of the longevity of an individual, the prior 
conditional probability for the individual's lifetime would depend on the time of birth. Your prior 
expectation for the longevity of an individual born, say in Britain, would depend on whether the 
individual was born in the second half of the 20th Century, at the beginning of the 19th Century, 
or 10,000 years ago, at the end of the last Ice Age. 

At time $t$ you make your observation. In Gott's formulation the observation yields the present
age, but we now understand that getting the present age presupposes that your observation finds
the phenomenon in progress. The first result of the observation is simply to determine whether
the phenomenon has not yet begun, is already over, or is in progress. Only the last of these
possibilities, denoted by $I$ for "in progress," is of interest to us. The conditional probability to find
the phenomenon in progress, given a birth time $t_0$ and a duration $T$, is

$$P(I|t_0,T) = \begin{cases} 1 , & t_0 < t < t_0 + T , \\ 0 , & \text{otherwise.} \end{cases} \tag{14}$$
The unconditional probability to find the phenomenon in progress is given by

$$\begin{aligned}
P(I) &= \int_{-\infty}^{\infty} dt_0 \int_0^{\infty} dT\, P(I|t_0,T)\,\gamma(t_0)\,w(T) \\
&= \int_{-\infty}^{t} dt_0\, \gamma(t_0) \underbrace{\int_{t-t_0}^{\infty} dT\, w(T)}_{=\,Q(t-t_0)} \\
&= \int_0^{\infty} dt'\, \gamma(t-t')\,Q(t') .
\end{aligned} \tag{15}$$

The assumption that $\gamma(t_0)$ is constant for all times of interest means that $\gamma(t - t') = 1/A$ for all
times $t'$ such that the survival probability $Q(t')$ is significantly different from zero. This allows us
to put $P(I)$ in the form

$$P(I) = \overline{T}/A , \tag{16}$$

where

$$\overline{T} = \int_0^{\infty} dT\, T\,w(T) = \int_0^{\infty} dT\, Q(T) \tag{17}$$

is the mean total duration with respect to the prior density $w(T)$. The present analysis assumes
that $\overline{T}$ is finite, which requires, for large durations $T$, that $Q(T)$ go to zero faster than $1/T$ or,
equivalently, that $w(T)$ go to zero faster than $1/T^2$. For an exponential decay, $\overline{T} = \lambda_0^{-1}$ is the
inverse of the decay constant. Notice that the probability to find the phenomenon in progress is very small.
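Equation (16) lends itself to a direct Monte Carlo check. The Python sketch below draws birth times uniformly from an interval of length $A$ centered on the observation time $t = 0$ and counts how often the phenomenon is in progress. The exponential prior for durations, the parameter values, and the function name are assumptions made for illustration.

```python
import random


def estimate_in_progress_probability(A, mean_duration, n_trials=200_000, seed=0):
    """Monte Carlo estimate of P(I) for an observation at t = 0.

    Birth times t0 are uniform on [-A/2, A/2], i.e. density 1/A; durations T
    are drawn from an exponential prior with the given mean (an illustrative
    choice).  Eq. (16) predicts P(I) = mean_duration / A when A is much
    longer than typical durations.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        t0 = rng.uniform(-A / 2.0, A / 2.0)
        T = rng.expovariate(1.0 / mean_duration)
        if t0 < 0.0 < t0 + T:  # phenomenon in progress at t = 0
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    A, mean_T = 50.0, 1.0
    print(estimate_in_progress_probability(A, mean_T), mean_T / A)
```

The estimate should agree with $\overline{T}/A$ to within sampling error, and it shrinks as $A$ grows, which is the sense in which finding the phenomenon in progress is unlikely.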

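Bayes's theorem gives the posterior probability density for $t_0$ and $T$, given that the phenomenon
is occurring:

$$p(t_0,T|I) = \frac{P(I|t_0,T)\,\gamma(t_0)\,w(T)}{P(I)} = \begin{cases} \gamma(t_0)\,w(T)/P(I) , & t_0 < t < t_0 + T , \\ 0 , & t_0 > t \text{ or } t_0 + T < t . \end{cases} \tag{18}$$

If $t - t_0$ is large enough in this expression that $\gamma(t_0)$ does not have its constant value, then $T > t - t_0$
is so large that $w(T)$ is negligible. Thus we can again replace $\gamma(t_0)$ by the constant value $1/A$,
leaving

$$p(t_0,T|I) = \begin{cases} w(T)/\overline{T} , & t_0 < t < t_0 + T , \\ 0 , & t_0 > t \text{ or } t_0 + T < t . \end{cases} \tag{19}$$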

It is instructive to consider Eq. (19) from a variety of perspectives. A first question asks how
the probability density for total duration changes on learning that the phenomenon is occurring:

$$p(T|I) = \int_{t-T}^{t} dt_0\, p(t_0,T|I) = \frac{T\,w(T)}{\overline{T}} . \tag{20}$$

Notice that $p(T|I)$ is biased toward longer durations than the prior density $w(T)$. This is because
the phenomenon is very unlikely to be in progress at a random time selected from the long time
interval $A$, so finding it in progress prejudices you to think that it has a longer duration than your
original expectations.
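The bias toward longer durations can be seen in a simulation. From Eq. (20), the mean duration given that the phenomenon is in progress is the second moment of $w(T)$ divided by $\overline{T}$; for an exponential prior with mean $\tau$ this equals $2\tau$. The Python sketch below estimates that conditional mean. The exponential prior, parameter values, and function name are illustrative assumptions.

```python
import random


def mean_duration_given_in_progress(A, mean_duration, n_trials=200_000, seed=0):
    """Estimate the mean of T over trials found in progress at t = 0.

    Birth times are uniform on [-A/2, A/2]; durations are exponential with
    the given mean (an illustrative prior).  Eq. (20) implies that this
    conditional mean is E[T^2]/E[T], which is 2 * mean_duration for an
    exponential prior.
    """
    rng = random.Random(seed)
    total, hits = 0.0, 0
    for _ in range(n_trials):
        t0 = rng.uniform(-A / 2.0, A / 2.0)
        T = rng.expovariate(1.0 / mean_duration)
        if t0 < 0.0 < t0 + T:
            total += T
            hits += 1
    return total / hits


if __name__ == "__main__":
    # The prior mean is 1.0; the in-progress mean should be close to 2.0.
    print(mean_duration_given_in_progress(50.0, 1.0))
```

Long-lived instances are over-represented among those caught in progress, which is the length-biased sampling that the factor $T$ in Eq. (20) expresses.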

A useful, equivalent form for Eq. (19) comes from changing variables to present age and future
duration. The Jacobian of the transformation from $(t_0,T)$ to $(t_p,t_f)$ is $-1$, which implies that
$dt_0\,dT = dt_p\,dt_f$. Hence, the probability density for present age and future duration, given that the
phenomenon is occurring, is

$$p(t_p,t_f|I) = p(t_0,T|I) = \begin{cases} w(t_p + t_f)/\overline{T} , & t_p \ge 0 \text{ and } t_f \ge 0 , \\ 0 , & \text{otherwise.} \end{cases} \tag{21}$$

Knowing the phenomenon is in progress is equivalent to saying that both the present age and future
duration are nonnegative, so we can regard that condition as implicit and omit it from subsequent
expressions. The content of Eq. (21) is the following: if you know the phenomenon is in progress,
but don't know its present age, you treat uniformly the split of total duration into past and future;
more precisely, you assign the same probability, governed by $w(T)$, to all ways of splitting $T$ into
past and future. That's the temporal Copernican principle. Indeed, Eq. (21) is the mathematical
embodiment of the temporal Copernican principle for phenomena known to be in progress.

Equation (21) has three immediate consequences that highlight the connection between the
Copernican principle and Gott's rule. We proceed by noting that once the phenomenon is known
to be in progress, the total duration is the sum of the present age and the future duration, i.e.,
$p(T|t_p,t_f,I) = \delta(T - t_p - t_f)$. Another application of Bayes's theorem then gives

$$p(t_p,t_f|T,I) = \frac{p(T|t_p,t_f,I)\,p(t_p,t_f|I)}{p(T|I)} = \frac{1}{T}\,\delta(T - t_p - t_f) . \tag{22}$$

This is a conditional version of Eq. (21), with the same content.

An obvious consequence is that if you know the phenomenon is in progress and also know its 
total duration, then you conclude that the present age is uniformly distributed between and T: 



\[
p(t_p|T,I) = \int_0^\infty dt_f\, p(t_p,t_f|T,I) =
\begin{cases}
1/T\,, & t_p < T, \\
0\,, & t_p > T.
\end{cases}
\qquad (23)
\]



This is the precise statement of what Gott is trying to capture in his initial assumption (1) about
the present age. The starting point of the flawed Bayesian analysis also asserts that $t_p$ is
uniformly distributed between 0 and $T$, but it differs from Eq. (23) in a subtle but crucial
way: because $p(t_p|T,I)$ is conditioned on knowing the phenomenon is occurring, further statistical
inference uses the conditional density $p(T|I) = T w(T)/\overline{T}$, instead of the prior density $w(T)$; we
find below [see Eq. (28)] that this is how the present Bayesian analysis comes into agreement with
the straightforward inference of the preceding section.

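A second obvious, but important consequence of Eq. (21) is that

\[
P\!\left( \left(\frac{1}{b}-1\right) t_p < t_f < \left(\frac{1}{a}-1\right) t_p \,\Big|\, T,I \right)
= \int_0^\infty dt_p \int_{(b^{-1}-1)t_p}^{(a^{-1}-1)t_p} dt_f\, p(t_p,t_f|T,I)
= \int_{aT}^{bT} \frac{dt_p}{T} = b - a\,. \qquad (24)
\]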



Since the condition on future duration in the probability on the left is equivalent to $aT < t_p < bT$,
this is just the statement that in dividing the total duration into past and future, possibilities
satisfying the condition are a fraction $b-a$ of all the possibilities. Furthermore, since the conditional
probability (24) is independent of $T$, the same result holds no matter what the prior density for $T$:



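\[
P\!\left( \left(\frac{1}{b}-1\right) t_p < t_f < \left(\frac{1}{a}-1\right) t_p \,\Big|\, I \right)
= \int_0^\infty dT\, P\!\left( \left(\frac{1}{b}-1\right) t_p < t_f < \left(\frac{1}{a}-1\right) t_p \,\Big|\, T,I \right) p(T|I) = b - a\,. \qquad (25)
\]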



Setting $b = 1$ and $a = (1+Y)^{-1}$ in this result yields, through the complementary probability,
the third consequence of Eq. (21):

\[
P(t_f > Y t_p|I) = \frac{1}{1+Y}\,. \qquad (26)
\]



Equations (25) and (26) are precise statements of Gott's rule in the forms (2) and (3). Indeed,
they look just like Gott's rule, with the crucial difference that they are conditioned on knowing the
phenomenon is in progress, without knowing its present age.
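
The universality asserted in Eqs. (25) and (26) can be checked numerically. The following is a minimal sketch, not part of the original analysis. It assumes two arbitrary priors $w(T)$, an exponential with mean 10 and a uniform density on [0, 30], both chosen only for illustration. Total durations are drawn from the length-biased density $T w(T)/\overline{T}$, each is split uniformly into past and future, and the fraction of cases with $t_f > Y t_p$ is compared with $1/(1+Y)$. All function names are hypothetical.

```python
import random

def sample_in_progress_exponential(mean, rng):
    # Length-biased exponential prior: T*w(T) is a Gamma(shape=2, scale=mean) density.
    total = rng.gammavariate(2.0, mean)
    past = rng.uniform(0.0, total)  # uniform split of T into past and future
    return past, total - past

def sample_in_progress_uniform(upper, rng):
    # Length-biased uniform prior on [0, upper]: density 2T/upper^2, so T = upper*sqrt(U).
    total = upper * rng.random() ** 0.5
    past = rng.uniform(0.0, total)
    return past, total - past

def estimate_gott_probability(sampler, y, n, rng):
    # Fraction of samples whose future duration exceeds y times the present age.
    hits = 0
    for _ in range(n):
        past, future = sampler(rng)
        if future > y * past:
            hits += 1
    return hits / n

rng = random.Random(0)
for y in (1.0, 4.0, 9.0):
    p_exp = estimate_gott_probability(
        lambda r: sample_in_progress_exponential(10.0, r), y, 100_000, rng)
    p_uni = estimate_gott_probability(
        lambda r: sample_in_progress_uniform(30.0, r), y, 100_000, rng)
    print(y, round(p_exp, 3), round(p_uni, 3), round(1.0 / (1.0 + y), 3))
```

Both priors should give estimates close to $1/(1+Y)$, which is the sense in which the result is independent of $w(T)$ when the present age is unknown.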
We are now in the curious position of affirming that for a phenomenon known to be in progress,
but whose present age is unknown, the temporal Copernican principle leads to universal statistical
predictions, which are described by Gott's rule. Indeed, all the manipulations in Gott's delta-t
argument are valid in this situation. The downside for Gott is that this conclusion does not authorize
his predictions: in these circumstances, Gott's rule has no power to predict future durations from
present ages, for the simple reason that the present age is unknown.

The results of the present Bayesian analysis make perfect sense in the three examples introduced 
above. Oddly enough, the deterministic example is the simplest: if you find the timer ticking, 
but there is nothing to indicate how long it has been ticking, it is reasonable to assign the same 
probability to all ways of dividing the 30-minute interval into past and future. 

The case of the women's longevity is a bit more complicated. If you encounter an individual, but 
are given no clue as to the individual's age, a first cut might treat the past-future split uniformly. 
For a person born in Britain, a more careful analysis would give greater weight to the future than 
to the past, because of the increase in life expectancy in this century. I have already indicated 
that the case of an individual's longevity does not fit into the temporal Copernican principle for 
just this reason. It should be emphasized that the problem is not the use of Bayesian analysis: the 
increase in life expectancy could be incorporated into a more complicated Bayesian analysis, which 
would automatically produce a bias toward the future. 

The case of the atom is particularly interesting because of Gott's exponential-decay derivation
of his rule. If you find the atom in the excited state, but you are not told when it was excited, it
is reasonable to assign the same probability to all ways of splitting a particular duration $T$ into
past and future and to weight the result by an exponential $e^{-\lambda T} = e^{-\lambda(t_p+t_f)}$, which expresses the
probability for duration $T$. The final result, properly normalized, is the probability density (21)
specialized to an exponential decay:

\[
p(t_p,t_f|I) = \lambda^2 e^{-\lambda(t_p+t_f)}\,. \qquad (27)
\]

This allows us to understand Gott's exponential-decay derivation of his rule [see Gott's Eq. (6)]:
he starts with Eq. (27) [see Gott's Eqs. (3) and (4)], from which he immediately derives Eq. (26),
all without realizing that Eq. (27) applies to an exponential decay whose present age is unknown.
Having gone through a proper Bayesian analysis, we now understand that Eq. (26) does not depend
at all on assuming an exponential decay, but rather is a universal consequence of the temporal
Copernican principle, valid no matter what the prior density $w(T)$, provided the present age is
unknown.
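
The contrast between an unknown and a known present age can be made concrete for the atom. The sketch below is an illustration under stated assumptions: the decay rate is set to 1 in arbitrary units, and the function names are hypothetical. Because the density (27) factorizes, past and future are sampled as independent exponentials. Averaged over the unknown age, the fraction with $t_f > Y t_p$ approaches $1/(1+Y)$, whereas for a known age the memoryless decay gives $e^{-\lambda Y t_p}$, which depends on the age.

```python
import math
import random

def simulate_atom(decay_rate, y, n, seed=0):
    # Present age unknown: sample (t_p, t_f) from the factorized density of Eq. (27).
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        past = rng.expovariate(decay_rate)
        future = rng.expovariate(decay_rate)
        if future > y * past:
            hits += 1
    return hits / n

def survival_given_age(decay_rate, y, past):
    # Present age known: exponential decay is memoryless, so only the rate matters.
    return math.exp(-decay_rate * y * past)

decay_rate = 1.0  # assumed value, in units of the inverse mean lifetime
for y in (1.0, 3.0):
    print("age unknown:", y, simulate_atom(decay_rate, y, 100_000), 1.0 / (1.0 + y))
for past in (0.1, 1.0, 10.0):
    print("age known:", past, survival_given_age(decay_rate, 1.0, past))
```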

The next task is to find out what happens if you do discover the present age. When you 
determine the present age of the phenomenon, your Bayesian posterior for the total duration is 
given by 

\[
p(T|t_p,I) = \frac{p(t_p|T,I)\,p(T|I)}{p(t_p|I)} =
\begin{cases}
0\,, & T < t_p, \\
w(T)/Q(t_p)\,, & T > t_p,
\end{cases}
\qquad (28)
\]

where

\[
p(t_p|I) = \int_0^\infty dT\, p(t_p|T,I)\,p(T|I) = Q(t_p)/\overline{T}\,. \qquad (29)
\]

This posterior density is identical to the one that emerged from the straightforward Bayesian
analysis that wholly ignored the Copernican principle. This is as it should be, because in the
language of this section, the straightforward Bayesian inference corresponds to first learning the
birth time $t_0$ and then discovering that the phenomenon has survived a time $t_p$, a situation that is
equivalent to first learning that the phenomenon is in progress and then discovering its present age.
Once you are informed of the present age or, equivalently, of the birth time, you are at a special time,
the time $t_p$ since the phenomenon began. The temporal Copernican principle becomes irrelevant.
It just gets in the way of the obvious inference expressed in Eq. (28).
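
The posterior of Eq. (28) is simply the prior renormalized over durations longer than the present age, so the posterior survival probability is $Q(T)/Q(t_p)$. A minimal sketch follows, using the 30-minute timer from the text and a hypothetical exponential prior whose mean of 20 is an arbitrary choice for illustration. The function names are not from the original.

```python
import math

def posterior_survival(prior_survival, present_age, total):
    # P(T > total | t_p) = Q(total)/Q(t_p) for total >= t_p, from the posterior w(T)/Q(t_p).
    if total <= present_age:
        return 1.0  # durations shorter than the present age are excluded
    return prior_survival(total) / prior_survival(present_age)

def timer_survival(t):
    # Deterministic timer from the text: it ticks for 30 minutes and then stops.
    return 1.0 if t < 30.0 else 0.0

def exponential_survival(t, mean=20.0):
    # Hypothetical exponential prior; the mean is an assumption made for illustration.
    return math.exp(-t / mean)

print(posterior_survival(timer_survival, 10.0, 25.0))  # 1.0
print(posterior_survival(timer_survival, 10.0, 31.0))  # 0.0
print(posterior_survival(exponential_survival, 8.0, 16.0))
```

In both cases the present age enters only by excluding shorter durations; no Copernican argument is involved.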

At this point it is profitable to re-read Gott's account of his 1969 encounter with the Berlin 
Wall. If Gott had not known when the Wall was built, the logic of the first two paragraphs of 
his account would be impeccable. Under those circumstances, it would be reasonable to assign 
probability 1/2 to the encounter's occurring during the middle two quarters of the Wall's total 
history. Since he did know that the Wall was built in 1961, however, his encounter did occur at a 
special time, the time eight years after the Wall's construction. The predictions made in the third 
paragraph of his account do not follow from the argument in the first two paragraphs. Indeed, his 
posterior expectations for the Wall's duration should have been a renormalized version of his prior 
expectations, whatever those were, with durations up to eight years excluded. 

We can now give a succinct account of how Gott's delta-t argument goes awry: the first two 
steps are wrong. The step from Eq. (1) to Gott's rule is a non-Bayesian inference having no
justification in probability theory; just as important, Eq. (1) is itself an incorrect expression of the
temporal Copernican principle, because it assumes that an observation at a random time will find 
the very unlikely result that the phenomenon is in progress. In repairing these errors, we discovered 
that Gott's rule for relating future duration to present age is indeed a universal consequence of the 
temporal Copernican principle, but only in a situation — not knowing the present age — which leaves 
the rule shorn of predictive power. Gott's predictions require knowing how long a phenomenon has 
lasted, but once you obtain this information, the temporal Copernican principle no longer has any 
impact, because you are at a special time within the lifetime of the phenomenon. 

Gott's rule as a predictor 

All of Gott's predictions — from the future duration of the Berlin Wall to the longevity of the 
human species — are now detached from their original mooring in the temporal Copernican principle 
and left to float free of justification. Yet a flawed analysis might lead to reasonable predictions. 
There might be some justification for Gott's predictions other than the Copernican principle. Both 
the straightforward Bayesian analysis and the analysis based on the Copernican principle culminate 
in the same inference [see Eq. (28)]: once you know the present age, your expectations about
total duration are the same as your prior expectations, except that durations shorter than the 
present age are excluded. Thus all questions about the applicability of Gott's predictions reduce 
to determining what prior density underlies his predictions. 

As noted above, Gott's rule follows from a posterior density

\[
p(T|t_p,I) = \frac{t_p}{T^2}\,, \qquad T \ge t_p\,. \qquad (30)
\]

Within the correct Bayesian analysis, this posterior comes from an unnormalizable prior density

\[
w_g(T) = \frac{1}{T^2}\,. \qquad (31)
\]

This prior density, distinguished by a subscript $g$, corresponds to a survival probability $Q_g(T) = 1/T$
and to a death rate $\lambda_g(T) = 1/T$. One way to characterize $w_g(T)$ is that the characteristic time
associated with the death rate, $\lambda_g^{-1}(T)$, is always the same as the age $T$.

The prior density $w_g(T)$ is different from the Jeffreys prior that Gott [3] identifies with his
predictions, the reason being that Gott uses the flawed Bayesian analysis given above. Yet within
the Bayesian analysis using the temporal Copernican principle, $w_g(T)$ has a scale-free status similar
to that found by Jaynes for the Jeffreys prior. Suppose that once you know the phenomenon is
in progress, anything else you know, coming from the prior information about $T$, is unchanged by
a simultaneous change in the scale of the past and the future. Under such a scale change, $t'_p = a t_p$
and $t'_f = a t_f$, the new and old probability densities are related by

\[
p'(t'_p,t'_f|I) = p(t_p,t_f|I)\,\frac{dt_p\,dt_f}{dt'_p\,dt'_f} = \frac{p(t_p,t_f|I)}{a^2}\,. \qquad (32)
\]

To say that all your information is unchanged by this scale change is to say that the old and
new densities should have the same functional form, i.e., $p'(t'_p,t'_f) = p(t'_p,t'_f)$, which implies that
$p(a t_p, a t_f|I) = p(t_p,t_f|I)/a^2$. Using Eq. (21) to write this in terms of the prior density, one finds
that $w(aT) = w(T)/a^2$, which implies that the prior density has the form (31).

As discussed above, the Jeffreys prior applies when your prior information about the duration,
before any observation, is scale-invariant. Once you know the phenomenon is in progress, however,
$w_g(T)$ captures the notion of scale invariance, because it corresponds to invariance of $p(t_p,t_f|I)$
under simultaneous rescaling of the past and future. In contrast, the Jeffreys prior corresponds to
invariance of $p(t_p,t_f|I)$ under rescaling of $t_p$ or $t_f$, but not both simultaneously.

We have now uncovered the prior information that underlies the use of Gott's rule as a predictor 
of future duration; namely, knowing that a phenomenon is in progress, you cannot identify any time 
scales associated with the phenomenon either into the past or into the future. One way of thinking 
about this is that for a phenomenon that has no time scales, discovering the present age does 
not put you at a special time in the phenomenon's history, so some consequences of the temporal 
Copernican principle survive. Whether the scale-free prior information is appropriate must be 
judged case by case; it is not a universal rule. The scale-free prior certainly does not apply to the 
three examples introduced in this article, each of which has an obvious time scale: for the atom, the 
scale is the decay time; for an individual, the scale is a typical human lifetime; for the deterministic 
phenomenon, the scale is the 30 minutes that the timer ticks. Ignoring these time scales is the 
reason that Gott's rule leads to absurd predictions for these examples. 

The examples Gott discusses at the beginning of his Nature article all have readily identifiable 
time scales that make application of Gott's rule problematic. The survival of a human institution — 
a political institution such as the government of the former Soviet Union or a cultural institution 
such as a periodical like Nature — is influenced by the 30-year time scale of a generation or by a 
typical human lifetime, since loyalty to and management of such institutions change on these time 
scales. Physical manifestations of human institutions, such as the Berlin Wall or Stonehenge, are 
influenced by these same human time scales and, in addition, by the time scale over which erosion 
leads to disintegration. 

The success of Gott's rule 

Even though there is little reason to adopt Gott's rule, he portrays his predictions as successful.
Consider, for example, his 95%-confidence prediction that Nature, given its 123-year
history of publication in 1993, would continue to publish for a period between 3.15 years and 4,800 
years. Gott would consider this prediction successful because Nature has already surpassed the 
lower bound and is very unlikely to exceed the upper bound. Yet there's the hitch: the upper 
bound is far too large; without doing any analysis, anyone could have written down a similar very 
large 95% confidence interval and achieved the same "success." To assess Gott's rule, one should 
direct attention not at the 95% confidence predictions, but at the high probabilities the rule
assigns to very long future durations. Gott's rule in the form $P(t_f > Y t_p) = 1/(1+Y)$ predicts that with probability
1/2, Nature will continue to publish for more than 123 years after 1993, with probability 1/5 for 
more than 492 years, with probability 1/10 for more than 1,107 years, and with probability 1/20 
for more than 2,337 years. These probabilities posit a great deal of faith in the durability of human 
institutions. 
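
The figures quoted for Nature follow from simple arithmetic on Gott's rule. The sketch below reproduces them. The function names are hypothetical, and the only inputs taken from the text are the 123-year present age and the quoted probabilities.

```python
def gott_interval(present_age, confidence):
    # Present age assumed to fall in the central `confidence` fraction of the total duration.
    tail = (1.0 - confidence) / 2.0
    lower = present_age * tail / (1.0 - tail)   # (1/b - 1) t_p with b = 1 - tail
    upper = present_age * (1.0 - tail) / tail   # (1/a - 1) t_p with a = tail
    return lower, upper

def gott_threshold(present_age, probability):
    # Future duration exceeded with the given probability, from P(t_f > Y t_p) = 1/(1+Y).
    y = 1.0 / probability - 1.0
    return y * present_age

low, high = gott_interval(123.0, 0.95)
print(round(low, 2), round(high))          # 3.15 4797
for prob in (0.5, 0.2, 0.1, 0.05):
    print(prob, round(gott_threshold(123.0, prob)))
```

The upper bound of 4,797 years is the "4,800 years" of the text, and the four thresholds are 123, 492, 1,107, and 2,337 years.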
To make this point more quantitatively, it is useful to consider a particular form of the
probability that future duration exceeds some multiple of present age:

\[
P(t_f > Y t_p|O) = P\bigl(T > (1+Y)t_p \,\big|\, O\bigr)
= \int_{(1+Y)t_p}^\infty dT\, p(T|O) = \frac{Q\bigl((1+Y)t_p\bigr)}{Q(t_p)}
= \exp\!\left( -\int_{t_p}^{(1+Y)t_p} dT\, \lambda(T) \right) . \qquad (33)
\]

This form makes clear that $P(t_f > Y t_p|O)$ depends only on the death rate during the interval
between the present age and the lower bound for longevity. For a death rate $\lambda_g(T) = 1/T$, one gets
Gott's rule.
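
Equation (33) can be evaluated for any death rate by numerical integration. The following sketch uses a simple midpoint rule; the function names and the 60-year constant-rate comparison are assumptions made for illustration. With the death rate $1/T$ the result should reproduce Gott's $1/(1+Y)$ regardless of the present age.

```python
import math

def survival_beyond(death_rate, present_age, y, steps=10_000):
    # exp(-integral of death_rate(T) from t_p to (1+Y) t_p), via the midpoint rule.
    lower = present_age
    upper = (1.0 + y) * present_age
    width = (upper - lower) / steps
    integral = width * sum(death_rate(lower + (i + 0.5) * width) for i in range(steps))
    return math.exp(-integral)

def gott_rate(t):
    # Death rate 1/T corresponding to the scale-free prior.
    return 1.0 / t

def constant_rate(t, tau=60.0):
    # Constant death rate with an assumed time scale tau.
    return 1.0 / tau

for age in (10.0, 123.0):
    print(age, survival_beyond(gott_rate, age, 1.0))      # close to 0.5 for any age
    print(age, survival_beyond(constant_rate, age, 1.0))  # depends strongly on age
```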

Now let's apply this to the example of a periodical like Nature. At start-up a new publication 
confronts a variety of short-term, rapid-death scenarios. Should it survive these initial hazards and 
become established like Nature, the next time scale it faces might be roughly a human lifetime. If 
this time scale is modeled by a constant death rate $\lambda = (60\,\mathrm{yr})^{-1}$, then one finds from Eq. (33)
that $P(t_f > Y t_p|O) = e^{-\lambda Y t_p}$. For Nature, this gives predictions quite different from Gott's: for
example, a probability 0.129 to continue publishing for more than 123 years beyond 1993 and a
probability $2.75 \times 10^{-4}$ to continue publishing for more than 492 years.
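
As a quick check of these figures, the sketch below compares the constant-death-rate probabilities with those of Gott's rule for the same horizons. The 123-year age and 60-year time scale come from the text; the function names are hypothetical.

```python
import math

PRESENT_AGE = 123.0  # years of publication in 1993, from the text
TIME_SCALE = 60.0    # years; the constant death rate is 1/TIME_SCALE, from the text

def constant_rate_probability(future_years):
    # exp(-lambda * t_f) for a constant death rate lambda = 1/TIME_SCALE.
    return math.exp(-future_years / TIME_SCALE)

def gott_probability(future_years):
    # 1/(1+Y) with Y = t_f/t_p, i.e. t_p/(t_p + t_f).
    return PRESENT_AGE / (PRESENT_AGE + future_years)

for years in (123.0, 492.0):
    print(years, gott_probability(years), constant_rate_probability(years))
```

The constant-rate values round to 0.129 and $2.75 \times 10^{-4}$, against 0.5 and 0.2 from Gott's rule.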

Should these predictions seem unduly pessimistic, it is because the constant decay rate does 
not recognize a long publication record as providing evidence for continued success. A prejudice 
that success begets success can be incorporated, without discarding the time scale, by choosing, for 
example, 

\[
\lambda(T) = \frac{1}{T}\left(\frac{T}{\tau}\right)^{\beta} , \qquad (34)
\]

where $\beta \ge 0$. For $\beta = 1$, this gives a constant death rate $\tau^{-1}$, and for $\beta = 0$, it gives Gott's rule.
For intermediate values, it gives a death rate that decreases with age, but with the time scale $\tau$
still having an effect. The resulting probability (33) is



\[
P(t_f > Y t_p|O) = \exp\!\left( -\left(\frac{t_p}{\tau}\right)^{\beta} \frac{(1+Y)^{\beta} - 1}{\beta} \right) . \qquad (35)
\]



For Nature this gives, assuming $\beta = 1/2$, a probability 0.305 to continue publishing for more
than 123 years beyond 1993, a probability $2.90 \times 10^{-2}$ for more than 492 years, and a probability
$2.05 \times 10^{-3}$ for more than 1,107 years. The point here is not the particular values nor even the
death-rate model, but rather that there are one or more time scales, which can and should be
incorporated in the prior distribution.
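
The quoted values can be reproduced from the closed form obtained by integrating a death rate of the form $(1/T)(T/\tau)^{\beta}$ in Eq. (33). The sketch below assumes the same 60-year time scale used for the constant-rate example, an assumption that reproduces the three figures above. The function name is hypothetical.

```python
import math

def decreasing_rate_survival(present_age, y, tau, beta):
    # exp(-(t_p/tau)^beta * ((1+Y)^beta - 1)/beta); beta -> 0 recovers Gott's 1/(1+Y).
    if beta == 0.0:
        return 1.0 / (1.0 + y)
    exponent = (present_age / tau) ** beta * ((1.0 + y) ** beta - 1.0) / beta
    return math.exp(-exponent)

present_age = 123.0  # years, from the text
tau = 60.0           # years, assumed to match the constant-rate example
for y in (1.0, 4.0, 9.0):  # 123, 492, and 1,107 further years
    print(y, decreasing_rate_survival(present_age, y, tau, 0.5))
```

Setting `beta` to 1 gives the constant-rate result $e^{-Y t_p/\tau}$, and setting it to 0 gives Gott's rule, so the single parameter interpolates between the two models.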

Gott stresses the success of his predictions [1] for the 44 Broadway and off-Broadway plays
listed in The New Yorker on 27 May 1993, the day his original Nature article was published. 
For example, Gott's 95%-confidence rule predicted that Cats, having played for 3,885 days, would 
continue to play for a period between 100 days and 415 years. Gott regards this prediction as a 
success because the production continues today, thereby surpassing the lower bound, and is unlikely 
to exceed the upper bound [1]. Yet since Cats had run 6,263 days through 30 November 1999,
when I determined that it was still running, the same rule predicts that with probability 1/5, it 
will continue to run for at least another 68.6 years, with probability 1/10 for at least another 154 
years, and with probability 1/20 for another 326 years. Such predictions ignore obvious time scales. 
A new production faces a variety of short- to medium-term scales, including the time to the first 
reviews, the time over which a producer is willing to back a losing production, the annual cycle 
of openings and closings, and the time over which a star performer tires of a particular part and 
moves on to other challenges. An established production like Cats, having survived these initial 
hurdles, must deal with the decade- to generation-long scale over which taste and fashion change
substantially and the production experiences a nearly 100% turnover of personnel. Including this 
long-term scale would temper Gott's predictions for extraordinarily long runs. 

The problems with Gott's long-term predictions show up more dramatically in phenomena, 
such as the longevity of an individual, where an initial period of low death rate is followed by 
relatively rapid extinction. We don't need a detailed model to tell us whether we should believe 
Gott's prediction, based on his age of 46.3 years on 27 May 1993, that he has a 1/3 chance to 
survive to more than 139 years old. 

There are two reasons, in my view, why Gott is able to get away with making his scale-free 
predictions for the survival of governments and plays and periodicals. First, statistical models for 
the longevity of these phenomena are not well developed, so Gott is protected from the absurdities 
that arise immediately in the three examples used in this article. Although there are readily 
identifiable time scales associated with the phenomena Gott considers, how to incorporate them 
into prior probabilities for duration has not been much investigated. There's a good reason for 
this: to assess the viability of an established government or play or periodical, readily available 
current data about the particular phenomenon in question — data such as the popularity of the 
government, the balance sheet of the play or periodical, trends in attendance at the play or the 
number of subscribers of the periodical — are far more cogent than prior information about longevity 
together with the present age. Second, the intervals that Gott finds for survival times are so wide 
that he is likely to be right, till he is forced to place bets based on the high probabilities he assigns to 
long survival times. A negative feature of such bets, however, is that the bettors might not survive 
till the bets are settled. Even for the case of human longevity, where one could easily formulate 
bets that Gott would almost certainly lose, the time scales are long enough that one might not get 
much personal satisfaction from winning. 

A way to overcome this difficulty is to bet on the survival of creatures with a shorter lifetime 
than humans, but for which data on present age and future survival are readily obtainable. For this 
purpose, I sent an e-mail on 21 October 1999 and again on 2 December 1999, to my department's 
most comprehensive e-mail alias, which includes faculty, staff, and graduate students, requesting 
information on pet dogs. The responses were compiled and checked for accuracy on 6 December; a 
notarized list of the 24 dogs, including each dog's name, date of birth, and breed, and the caretaker's 
name, was deposited in my departmental personnel file on 21 December 1999. Gott's rule predicts 
that each dog will survive to twice its present age with probability 1/2. For each of the 6 dogs 
above 10 years old on the list, I am offering to bet Gott $1,000 US, at odds of 2:1 in his favor, that 
the dog will not survive to twice its age on 3 December 1999. The reason for weighting the odds 
in Gott's favor is to test his belief in his own predictions: given the odds, his rule says that his 
expected gain, at $1,000 per bet, is $6,000; moreover, the probability that he will be a net loser 
(by losing five or more of the six bets) is 7/64=0.109. 
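
The probability of being a net loser can be verified with a short binomial calculation. The sketch below assumes one reading of the 2:1 odds: each bet Gott wins pays him two units and each bet he loses costs him one unit, with each dog surviving with probability 1/2 as his rule predicts. The function name and the payoff parameters are hypothetical.

```python
from fractions import Fraction
from math import comb

def net_loser_probability(n_bets, win_probability, win_units=2, loss_units=1):
    # Sum the binomial probabilities of every outcome with a negative net payoff.
    total = Fraction(0)
    for wins in range(n_bets + 1):
        net = win_units * wins - loss_units * (n_bets - wins)
        if net < 0:
            total += (comb(n_bets, wins)
                      * win_probability ** wins
                      * (1 - win_probability) ** (n_bets - wins))
    return total

print(net_loser_probability(6, Fraction(1, 2)))  # 7/64
```

Under this payoff assumption, winning exactly two of the six bets breaks even, so a net loss requires losing five or six bets, which gives 7/64.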

Discussion 

The stated objective of this article is to determine what prior information underlies Gott's rule. 
Gott proposed his rule as a predictor of future duration based on knowing the present age and
nothing else. What we have discovered is that the actual prior information underlying Gott's rule
is both less and more than he thought. On the one hand, Gott's rule is a consequence of the 
temporal Copernican principle for a phenomenon whose age is unknown, but this universal form 
of Gott's rule has no predictive power for future durations. On the other hand, Gott's rule as a 
predictor of future durations is a consequence of discovering the present age of a phenomenon that 
has no identifiable time scales in the past or future. 

What about the focus of Gott's Nature article, the longevity of the human species? A species's 
survival depends on its ability to adapt to short- and long-term environmental changes produced
by other species in its ecosystem and by climatological and geological processes. The adaptations
are made possible by existing genetic variability in the gene pool and by random mutation. How
Homo sapiens fits into this picture is a complicated question, certainly not amenable to a universal
statistical rule. As Ferris puts it, "... in my experience most people either think we're going to
hell in a handbasket or assume that we're going to be around for a very long time." Both views
are a reflection of advancing technology. The first comes from alarm at technology's increasing
impact: changes might be so rapid that we (and certainly other species) could not adapt. The
second comes from a belief that technology can save us — by controlling the environment or by 
making possible remarkable adaptations such as escaping our earthly environment or changing our 
genetic constitution. 

Gott dismisses all such thinking as the illusions of those who don't appreciate the power of 
the Copernican principle. He contends that everything relevant to assessing our future prospects is 
contained in the statement that we are not at a special time. This article shows that the Copernican 
principle is irrelevant to considerations of the longevity of our species. Perhaps we are still subject 
to the factors that determine the survival of other species. More likely, our survival, and the
survival of many other species along with us, depends on what we do now and in the future. We
had better think hard about it.